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Long-range dependence and non-Gaussianity are ubiquitous in many natural sys- 
tems like ecosystems, biological systems and climate. However, it is not always 
appreciated that both phenomena may occur together in natural systems and that 
self-similarity in a system can be a superposition of both phenomena. These fea- 
tures, which are common in complex systems, impact the attribution of trends and 
the occurrence and clustering of extremes. The risk assessment of systems with 
these properties will lead to different outcomes (e.g. return periods) than the more 
common assumption of independence of extremes. 

Two paradigmatic models are discussed which can simultaneously account for 
long-range dependence and non-Gaussianity: Autoregressive Fractional Integrated 
Moving Average (ARFIMA) and Linear Fractional Stable Motion (LFSM). Sta- 
tistical properties of estimators for long-range dependence and self-similarity are 
critically assessed. It is found that the most popular estimators can be biased in 
the presence of important features of many natural systems like trends and multi- 
plicative noise. Also the long-range dependence and non-Gaussianity of two typical 
natural time series are discussed. 

Keywords: Memory, Paradigmatic Models, Multiplicative Noise 

1. Introduction 

Meteorology has been both a seedbed and a testbed for many advances in the 
mathematics of complex systems; the former being seen in its contribution of the 
Lorenz model to nonlinear science (Lorenz 1963), while the latter is exemplified 
by ensemble forecasting. This parallel evolution of theory and practise has iden- 
tified important problems, e.g., the need to model the effect of extra, noise-like, 
degrees of freedom on deterministic low dimensional dynamics. Other well estab- 
lished paradigms such as Hasselmann's stochastic model (Hasselmann 1976), have 
highlighted the importance of red noise to mathematical climatology. 

However, since the pioneering work of the late Bcnoit Mandelbrot, increasing 
attention is paid to two self-similar (or "fractal") aspects of time series: long-range 
dependence (LRD) in time, and the spatial counterpart of LRD, heavy tailed prob- 
ability distributions in amplitude. First identified in hydrology, and since studied 



Article submitted to Royal Society 



TgX Paper 



2 C. L. E. Franzke, T. Graves, N. W. Watkins, R. B. Gramacy and C. Hughes 



in research areas as diverse as biology, telecommunications, social networks, econo- 
metrics and the climate system, LRD is characterised by its low frequency singular 
behaviour, the so-called 1// power spectrum. When present in a signal the corre- 
lations captured by LRD will both hamper the identification and the attribution 
of deterministic trends (e.g. Franzke 2010), and impede the quantification of their 
significance. As was the case in the original context of hydrology, the presence of 
LRD in climate time series has been intensely debated, and it is still sometimes 
ascribed to transient effects, calibration issues or other forms of nonstationarity 
(Maraun et al. 2004, Rust et al. 2008). 

LRD is also of practical significance. By making systems subject to longer-lived 
fluctuations (Beran 1994) it changes the information available to make predictions 
about the state of a complex system. It impacts the occurrence and clustering of 
extremes (Bunde et al. 2005, Bogachcv et al. 2008, Kropp and Schcllnhubcr 2010) 
which arc important for risk assessments and mitigation strategics, e.g. insurance 
pricing and flood defence. The traditional assumption is that extremes arc indepen- 
dent events. But there is growing evidence for clustering of extreme extra-tropical 
storms (e.g. Mailier et al. 2006) and precipitation events (e.g. 2007 United Kingdom 
floods). This clustering of extremes will lead to higher return periods of extreme 
events than the more common assumption of independence. Recently, detrended 
fluctuation analysis (DFA), a tool originally developed to detect LRD, has found 
application in the prediction of dangerous bifurcations in dynamical systems such 
as climate "tipping points" (Livina and Lenton 2007, Lenton et al. 2011, Sicbcr 
and Thompson 2011). Our contribution to this volume is focused both on i) the 
problem of accurately estimating LRD in the presence of other signal elements, and 
ii) the complicating effects of heavy-tailed amplitude distributions, when they are 
present. 

A process is long range dependent when the prediction of its next state de- 
pends on the whole of its past. An imprint of this dependence structure is that the 
covariance r(k) = Cov(X(k),X(Q)) decays slowly, as k — > oo, so that 



This slow decay of the covariances means that the values of the process X are 
strongly dependent over long periods of time. This contrasts with the more famil- 
iar short-range dependent system where J2"kLo l r (^)l = C* < oo. In a short-range 
dependent process the next state only depends on the current state and the recent 
past. The archetype of a short-range dependent process is a first order Markov 
process where the next state depends only on the present state and is conditionally 
independent of past states. See Beran (1994) for more details. 

Long-range dependence of a system is characterised by the parameter d in the 
statistics literature and sometimes by Mandelbrot's Joseph parameter J, for exam- 
ple in the physics literature. Temporal long-range dependence has been detected in 
the water level of the river Nile (Hurst 1951, Hurst et al. 1965). It was observed that 
the range of values grows as t j , where J refers to the Joseph exponent (e.g. Man- 
delbrot 2001, p. 157) and r to the time period under consideration. The growth of 
range was anomalously large compared to that in the familiar paradigms of random 
walks and Brownian motion, which have both played a central role in understanding 
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diffusion. In the random walk, the variance grows as the square root of time r 1 / 2 
(Einstein 1905). The subsequent theoretical explanation of the difference between 
the observed anomalous diffusion and the Brownian motion paradigm came first 
through the use of a self-similar process, in particular via the study of fractional 
Brownian motion (fBm; Mandelbrot & Wallis 1968). Here, self-similarity means 
that the statistical properties on all scales are similar. This property is controlled 
by the self-similarity parameter H . A stochastic process S is self-similar when a 
rescaling of time by a factor A leads to a rescaling of the amplitude of the process 
S by \ H . Thus, a process is said to be i?-self-similar when the following relation 
holds (Embrechts and Maejima 2002): 

S(Xt) ~ \ H S(t). (1.2) 

where ~ denotes equality in distribution. It is not always appreciated, and is some- 
times confused in the literature, that two different aspects of a time series can 
contribute to its self-similarity H. The first is LRD, which is synonymous with 
persistence, referred to by Mandelbrot and Wallis (1968) as the Joseph effect. Per- 
sistent systems exhibit longevity by having a tendency to maintain the way they 
have been recently. Examples are heat waves and drought conditions. The second 
source of self-similarity identified by Mandelbrot, the so-called Noah effect is the 
property that change in a system can be rather large and can occur very abruptly 
i.e. time series drawn from systems can exhibit sharp discontinuities, e.g., Earth 
quakes. 

It may at first seem odd that both phenomena occur simultaneously in natural 
systems because they pull in opposite directions. On reflection, taken together, 
however, we see that the two effects capture the facts that coherent structures in 
nature are real and that they can emerge, change or even vanish very quickly. The 
archetypal model of the Noah effect is non-Gaussian jumps whose complementary 
cumulative distribution function decays with a power law in size s, p(s) ~ s~ a . 
Here, a denotes the stability exponent (referred to as tail exponent in the statistics 
literature). Thus, the self-similarity parameter is 

H = J -]- + - = d+-. He (0,1), (1.3) 
2 a a 

The distinction between H and J = d+ ^ is important because not all observed time 
series have Gaussian fluctuations. As such one may find that popular diagnostics, 
as shown below, may be insensitive to heavy tails, measuring J and not H. However, 
in many situations both heavy tails and the clustering of extremes caused by LRD 
are very important and both contribute to the value of H. This makes it necessary 
to be able to measure both H and J independently. 

Many estimators for the LRD exponent d have thus been developed, and are 
widely used in the respective literatures. Much of what has been rigorously estab- 
lished about the estimators, however, is for a particular LRD Gaussian model: fBm, 
which is also H self-similar and has a particularly simple relation between H and J, 
i.e. if = J + i — | = J (a = 2 for Gaussian increments). Most observed time series 
depart from the assumptions of fBm in some way or another, so it is important 
to critically evaluate how sensitive the estimators are to deviations from fractional 
Gaussian noise (fGn). fBm is a random walk with long-range correlated increments 
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and the value of J in such iJ-sclf-similar walks is directly connected to the presence 
of LRD in their increments. 

In this study we critically re-evaluate 4 popular methods for measuring d and 
two for measuring H : variable bandwidth estimator (Schmittbuhl et al. 1995) and 
wavelets (WL; Stoev and Taqqu 1995), Detrcndcd Fluctuation Analysis (DFA; Peng 
et al. 1994), the Re-Scaled Range analysis (R/S; Hurst 1951, Mandelbrot 2001), 
exact Whittle estimator (Shimotsu and Phillips 2005, 2006) and semi-parametric 
estimators (in this study we will focus on the Geweke-Porter-Hudak (GPH) estima- 
tor; Geweke & Porter-Hudak 1983; semi-parametric estimators are also described 
in Bardet et al. 1998, Robinson 1995a, 1995b, Hurvich et al. 2005). This is a not 
a complete set of estimators for the self-similarity or long-range dependence pa- 
rameter (other estimators, which are well established in the statistics community 
include FEXP (Hurvich et al. 2002)). Here we want to focus on some of the most 
popular estimators in the physics and geosciences communities (R/S: Tuck and 
Hovdc 1999, Price and Newman 2001, Ogurtsov, 2004, Scipioni et al, 2008, Ghil et 
al. 2011, DFA: Chen et al. 2002, Bunde et al. 2005, Bashan et al. 2008, Ghil et al 
(2011), GPH: Vyushin and Kushner 2009, Huybers and Curry 2006, Franzke 2010, 
VB: Escorcia-Garcia et al. 2009, WL: Stoev and Taqqu 2005, Stoev et al. 2005). 

We show scatter plots which compare the imposed parameters for a number of 
trials with the values detected by the above methods. These have the advantage of 
visually indicating which methods measure the complete self-similarity exponent H , 
and which, instead, measure long range dependence via d. For evaluating the per- 
formance of these estimators we employ 2 well-established paradigmatic time series 
models: A self-similar process (Linear Fractional Stable Motion (LFSM); Samorod- 
nitsky & Taqqu 1994) and an asymptotically self-similar process (Autoregressive 
Fractional Integrated Moving Average (ARFIMA); e.g. Beran 1994). 

In most natural systems trends are common properties, so we systematically 
examine the performance of the various estimators when a linear trend is superposed 
on each time series. This is an important issue because in practice it is not always 
easy to remove trends. That they typically manifest themselves in higher moments 
presents further challenges still. Trends are the hardest to deal with because long- 
range dependent processes can produce apparent trends over rather long periods of 
time (e.g. Franzke 2010). Therefore, it can be difficult to decide if a trend is due to 
external forcing or due to finite time series length. 

In section 2 we present the two paradigmatic time scries models in more detail 
and discuss various special cases. In section 3 we introduce the estimators of long- 
range dependence and self-similarity. We test their utility in section 4. In section 

5 we discuss the LRD and self-similarity properties of two exemplary time series 
from nature and in section 6 we provide conclusions. 

2. Paradigmatic Models of Natural Time Series 

(a) Natural Time Series Examples 

The most prominent and widely used paradigmatic LRD time series model, 
especially in the physics literature, is fGn. However, most natural time series are 
not strictly Gaussian and many are actually highly non-Gaussian. To illustrate this 
point we discuss two time series from nature. In Fig. [T^, we display monthly mean 
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Figure 1. a) Faraday- Vernadsky station temperature time series in degree Celsius (annual 
cycle is subtracted) (01.01.1951 - 28.02.2007), b) Probability density function of the Fara- 
day- Vernadsky station temperature time series estimated by a Kernel density estimator: 
Solid line: whole year; Dashed line: Cold season data May through September; Dashed- 
Dotted line: Warm season data November through March, c) Auroral electrojet index 
from January through June 2006 in nanotesla and d) Probability density function of the 
Auroral electrojet index. 



temperature from the Antarctic station Faraday- Vernadsky (Turner et al. 2004, 
Franzkc 2010). It is easy to see that the raw time series is highly non-Gaussian. 
This is confirmed by the probability distribution functions (PDFs) plotted in Fig. 
[TJd. There is also a striking seasonal signal visible in the skewness of the PDFs. 
However, it must to be noted that polar temperature time series are much more non- 
Gaussian than mid-latitude ones which are typically nearly Gaussian. Our second 
exemplary time series is the auroral electrojet (AE) index (Fig. [TJ e.g., Davis & 
Sugiura 1966, Watkins 2002). The AE index is derived from 1 minute resolution 
time series from 12 high latitude magnetometers. Reflecting the intermittent nature 
of the ionospheric and solar wind processes which influence it, the AE index is seen 
to be spiky and strongly non-Gaussian (Fig. [TH). Clearly, many natural time series 
arc non-Gaussian. 



(b) Paradigmatic Models 

While fGn and fBm are Gaussian, they are still useful idealised paradigmatic 
models for understanding many observed phenomena. A paradigmatic model is an 
idealised framework which captures some properties of observed time series, though 
not all. Most paradigmatic models allow analytical work, although at the expense 
of an over-idealisation of the physical or biological phenomena. There is a need 
for better paradigmatic models of observed phenomena which allow simultaneously 
non-Gaussian statistics and LRD. Such models are needed e.g., for hypothesis test- 
ing and time series simulation. Thus, we briefly discuss two paradigmatic time series 
models which exhibit LRD and non-Gaussianity. The widely used fGn and fBm arc 
special cases in the Gaussian limit of these more general models, thereby retaining 
some of the analytical tractability of these models. 

In this study we use two classes of processes with symmetric cv-stable (SaS) 
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distributions. They are the LFSM and the ARFIMA model with SaS innovations. 
Stoev and Taqqu (2004) present efficient methods for the simulation of these pro- 
cesses using the Fast Fourier Transform. 

(i) Linear Fractional Stable Motion 

LFSM is a model which exhibits LRD and heavy tails at the same time. It 
is an extension to the simpler Brownian random walk model. It links individual 
heavy tailed jumps by means of a retarded memory kernel. It can be represented 
by a stochastic process Xu,a = {^H,a(t),t G R}Q which is defined by the following 
stochastic integral 

X H , a (t) = C H ] a f ((t - - dL a (s), (2.1) 

where < H < 1, a S (0,2), and where L a = {L a (s),s £ R} is a standard 
symmetric a-stable (SaS) Levy process. The process Xh,cx is called LFSM and is 
self-similar with self-similarity parameter H. Thus, it satisfies relation (jl.2|) and has 
stationary increments. The parameter a controls the tails of the distribution of an 
SaS random variable £, that is, 

P > x) ~ x~ a , as x -> oo. (2.2) 

The greater the value of a, the lower the probability of extreme fluctuations of the 
SaS process Xn, a - We recommend the introductions by Taqqu (2003) and Mercik 
et al. (2003) for detailed expositions. 

(ii) Fractional Brownian Motion (fBm) 

In the Gaussian case we have that a = 2 and the LFSM process (|2.1[) reduces 
to fBm. In this case the self-similarity parameter H will always equal the Joseph 
exponent J, thus, H = J = d + \. 

(iii) Ordinary Levy Motion (oLm) 

In the memory- less case we have d = and < a < 2. In this case equation 
(|2.2[) holds and the tails of the distribution decay according to a power-law and 
consequently L a has infinite variance and is called ordinary Levy motion. This 
encapsulates the Noah effect. In this case the self-similarity exponent is H = — . 

(iv) Autoregressive Fractional Integrated Moving Average 

Another paradigmatic model exhibiting LRD with heavy-tailed fluctuations is 
ARFIMA, which has the added ability of exhibiting short-range dependent be- 
haviour. This model, denoted XRFlMA(p,d,q), p,q E N, extends the usual Au- 
toregressive Integrated Moving Average ARIMA(p,d,q) models, in which d takes 
integer values (Box & Jenkins 1970). An ARIMA model is written as: 

$(B)(l-B) d X t = y(B)e t , (2.3) 

f This is the traditional way of defining LFSM, though H has been defined in 11.31 1 as the 
dependent and d and a as the independent variables. 
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where B denotes the back shift operator defined by BX t = X t -i,B 2 X t = X t -2--- 
The polynomials $ and $ are defined as 4>(a;) = 1 — Yjj=\ a i x ^ anc ^ ^( x ) = 
1 + 'Y3j=\°j x ^ where p and q are integers. The innovations e t (t = 1,2...) are 
usually independent and identically distributed (iid) normal variables with zero ex- 
pectation and variance of, but can also be a-stable distributed. The widely used 
autoregressive process of first order (AR(1)) is a special case with p = 1, d = and 
q = 0. Typically d is an integer. In the case that d is a fractional real number, X(t) 
is an ARFIMA(p,d,g) process with — \<d<\ and exhibits LRD. In the Gaussian 
case ARFIMA is stationary. An ARFIMA (0,d, 0) is asymptotically equivalent to 
fflm (Taqqu 2003). An ARFIMA(p, d, q) with p > and q > is long-range de- 
pendent but not self-similar. However, it is asymptotically self-similar for long time 
scales. 



3. Estimators of Long-Range Dependence and Self-Similarity 

Here we briefly describe the four estimators of the self-similarity and LRD param- 
eters used in this study. 

(a) Variable Bandwidth 

The variable bandwidth (VB) method (Schmittbuhl et al. 1995) is a technique 
for estimating the self-similarity exponent, H, from a time series x(t). The time 
series of length T is divided into windows or 'bands' of width r. The VB method 
can deploy two different algorithms to estimate H . (i) The standard deviation of 
the time series, cr(r), is computed in each band; or (ii) the difference between the 
maximum and minimum values in each band, e(r), is computed. Then tr(r) and e(r) 
are averaged over all the possible bands by varying the origin at fixed r 

VB w (r) = —J^o-iir) and VB 4 (r) = — V e 4 (r), (3.1) 

i— 1 i— 1 

where L r is the number of windows of length r. This is repeated over a range of 
window sizes. Both quantities follow a power-law behaviour for self-similar time 
series (Schmittbuhl et al. 1995) such that VB w (r) = r H and VB 5 (r) = r H . Thus, 
the self-similarity exponent, H, is obtained from the slope of the corresponding 
log-log plot. 

(b) Wavelets 

A wavelet ip is a function with zero average and is normalised to one. A fam- 
ily of wavelets is generated by scaling ip by a factor s and translating it by u 
(VVs(*) = -^'4> (^r))- The wavelet transform allows to construct a time-frequency 
representation of a signal, the wavelet spectrum. One can then infer the self- 
similarity parameter from the wavelet spectrum via ordinary least squares at large 
wavelet scales (Abry and Veitch 1998, Stoev and Taqqu 2005). 
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(c) Rescaled Range 

The rescaled range R/S (Hurst 1965) is a technique for estimating the LRD 
parameter d from a time series. The R/S estimator is given by 

g/g(T? = E ^x E ^)-mma(t,r) 

where 1 < t < r, x(t, r) = J2u=i x ( u ) ~ ( x )r, ( x )t = \ J2t=i x (t)- ^ scales like t j 
where the value of J can be estimated from the slope of R/S from a log-log plot. 



(d) Detrended Fluctuation Analysis 

Dctrendcd Fluctuation Analysis (DFA; Peng et al. 1994) also estimates the 
LRD parameter d from a time series. In DFA, a profile is first computed by Y(i) = 
J2t=i x (t)- The profile is cut into N s non-overlapping segments of equal length s 
and then the local trend is subtracted for each segment v by a polynomial least- 
squares fit of the data. Linear (DFA1), quadratic (DFA2), cubic (DFA3) or higher 
order polynomials can be used for detrending. In the n th order DFA, trends of order 
n in the profile, and of order n — 1 in the original record, are eliminated. Next the 
variance for each of the N s segments is calculated by averaging over all data points i 
in the v th segment: 

F?(v) = (y s 2 «> = -J2 Y s H v + • ( 3 - 3 ) 
s £ — ' 

i=l 

Finally, the average over all segments is computed and the square root is applied 
to obtain the following fluctuation function 



1 

s v—l 

For different detrending orders, n, we obtain different fluctuation functions F(s), 
which are denoted by F^ (s). The fluctuation function scales according to F^ (s) ~ 
s'', with d = £ — |. There are many variants of DFA, but use of standard DFA is 
recommended by Basahn et al. (2008) if the functional form of a trend is not a 
priori known. 



\ 



(e) Exact Whittle Estimator 

The Exact Whittle estimator is a semi-parametric estimator (Shimotsu and 
Phillips 2005, 2006). This method assumes that the underlying model of LRD can 
be represented by (1 — B) d X t = e t , where e is iid noise. See Shimotsu and Phillips 
(2005, 2006) for more details. 



(/) Power Spectral Method 

As a semi-parametric estimator we use the power spectral method of Geweke 
& Porter- Hudak (1983) and Hurvich et al. (2001). Spectral methods find d by 
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Figure 2. Estimates of H for walks derived from fractional Gaussian noise (fGn) and 
ordinary Levy noise (oLn) for Wavelet method and the Variable Bandwidth estimator 
VB (crosses denote VB$ and circles VB™). Estimates of d for fGn and oLn for Whittle 
estimator, R/S, DFA1 and GPH. The solid line indicates the true self-similarity parameter 
H (d in the case of fGn or 1/a for oLn). 

estimating the spectral slope. The periodogram is used, which is an estimate of the 
spectral density of a finite-length time series and is given by: 



1 



N 



-i2irt\j 



3 1 V/2], 



(3.5) 



where Xj = j/N is the frequency and the square brackets denote rounding towards 
zero. A series with LRD has a spectral density proportional to |Ap 2d close to the 
origin. Since S(X) is an estimator of the spectral density, d is estimated by a re- 
gression of the logarithm of the periodogram versus the logarithm of the frequency 
A. Thus having calculated the spectral density estimate S(X), semi-parametric es- 
timators fit a power law of the form /(A, b,d) = b \X\ d , where b is the scaling factor. 



4. Empirical Tests 

In order to examine the statistical properties of the LRD estimators, like bias, 
spread and outliers, we generate various test time series from the above paradig- 
matic models. Specifically, we generate ensembles of 1000 members with randomly 
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selected parameters from uniform distributions: fGn with d E (—0.5,0.5), oLn 
with a E (1,2), ARFIMA with Gaussian increments ARFIMA-G(1, d, 1) and with 
ordinary Levy increments ARFIMA-L(1, d, 1) with the autorcgrcssive coefficient 
di E (— |, g), and the moving average coefficient bi E (0,1), d E (—5,5)' an d 
a E (1,2). We also use various first order autorcgrcssive processes with parameters 
randomly chosen in (-1,1). The time series length is always 2 15 — M with M = 6000 
(see Stoev & Taqqu (2004) for an explanation of M) which is comparable to the 
length of most observed climatic and other natural time series, thus long enough 
for our purposes. 

(a) Paradigmatic Time Series 

First we examine how well the SS and LRD estimators work for the models of 
self-similarity Most of the methods (e.g. DFA and R/S) require a regression fit in 
order to estimate the SS or LRD parameters. As shown by Chen et al. (2002) non- 
stationarities and short-range dependencies can cause crossovers in the fluctuation 
curves. Because of this we only regress on the long-range part of the fluctuation 
curves. Our results are robust to the particular choice of the cut off. 

As can be seen in Fig. [5]the Wavelet and the VB methods are the only methods 
which are able to infer the self-similarity of oLn. While the Wavelet method has 
no large estimation spread, the VB method exhibits with large errors. All other 
estimators estimate d rather than H, with R/S producing the largest estimation 
variance, while Whittle, DFA1 and GPH have considerably smaller estimation vari- 
ances but with the odd outlier (Fig. [2]). All four estimators do a reasonably good 
job of inferring H from fGn with VB m (r) having the largest bias for H close to 
and close to 1, and R/S also having some bias at small H with a relatively large 
estimation variance. Wavelet, DFA1 and GPH produce very tight estimates, with 
Wavelet and DFA1 only biased for small values, and Whittle and GPH working well 
over the entire range. For these paradigmatic time series the two semi-parametric 
estimators, Whittle and GPH, give the best estimates. This picture changes once 
we allow for short-term dependence structures to contaminate the pure self-similar 
character of fGn and oLn. As Fig. [3] shows all estimators work considerably worse 
for ARFIMA surrogate data with large estimation spreads and many outliers. Again 
the Whittle estimator and GPH perform better than the other estimators. 

Before we go on to examine how well the estimators work with superimposed 
trends, we test their accuracy on data generated from three basic short-range de- 
pendence models: independent white noise, AR(1) and ARFIMA(1,0,1). As Figs. 
Hk, b and c show, Whittle estimator, GPH and DFA have the least bias. But it also 
has to be noted that all estimators have many outliers, suggesting that given an 
individual time series the estimates of d or H can be severely biased. The perfor- 
mance of higher order DFA is very similar to DFAl in the above test cases (not 
shown). These results are consistent with previous empirical studies (e.g. Taqqu et 
al. 1995). 

(6) Trends 

Another important test is to see how the presence of trends affects the various 
estimators. Here we consider three cases: 1) a linear trend superimposed by reali- 
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ARFIMA-G ARFIMA-L ARFIMA-G ARFIMA-L 

Wavelet VB 




Figure 3. Estimates of H for walks derived from ARFIMA-G(l,d,l) and ARFIMA-L(l,d,l) 
with the Wavelet method and Variable Bandwidth estimator VB. Estimates of d for ARFI- 
MA-G(l,d,l) and ARFIMA-L(l,d,l) for the Whittle estimator, R/S, DFA1 and GPH. The 
solid line indicates the true self-similarity parameter H. 



sations of fGn, oLn, ARFIMA-G and ARFIMA-L; and 2) a linear trend only in the 
second half of the time series superimposed by realisations of fGn, oLn, ARFIMA- 
G and ARFIMA-L; 3) a linear trend in the variance. Cases 2 and 3 are motivated 
by climate change where the time series may (case 2) or may not (case 1) include 
the pre-industrial era, and where climate change also influences the frequency and 
strength of fluctuations (e.g. storms; case 3). Based on the evidence gathered so far, 
it is reasonable to expect this can lead to bias any long-range dependence estimate. 

For cases 1 and 2, we assume the magnitude of the linear trend to be 1. The 
empirical tests reveal that the GPH estimator is slightly biased for fGn, ARFIMA- 
G and ARFIMA-L and has a relatively large negative bias for oLn (Figs. [5]). DFA 
is least biased for fGn data but has considerable bias for oLn, ARFIMA-G and 
ARFIMA-L (Figs. EJ) generated data. Both ARFIMA-G and ARFIMA-L estima- 
tors have a large number of outliers. Both R/S and VB show qualitatively similar 
behaviour compared to GPH (not shown). Our results are qualitatively consistent 
with the study by Hu ct al. (2001). 

Now we examine how the estimators handle a trend in the variance (case 3). 
For this purpose we use an AR(1) model with increasing variance: 

x t +i = F + oat + <r(l + looooXt. (4-1) 
Here, again, we generate 1000 realisations by sampling values for F from a uniform 
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Figure 4. Box plots of the difference between the nominal parameter value and the em- 
pirical parameter estimate: a) uncorrelated Gaussian white noise, b) AR(1) and c) ARFI- 
MA-G(1,0,1), d) AR(1) process with linear trend in variance and e) Markovian SDE with 
CAM noise. On each box the central mark is the median, the edges of the boxes are 
the 25th and 75th percentiles, the whiskers extend to the most extreme data points not 
considered to be outliers and outliers are marked individually. 



distribution £7(0,2), using a G E/(— 0.5, 0.5), and a G £7(0,1). For this case DFA is 
the least biased and GPH and R/S show considerable bias while the VB estimators 
have a huge bias, although, their estimates are very narrow (Fig. |3^). 

(c) CAM noise 

Both LFSM and ARFIMA arc additive noise models. In previous studies it has 
been shown that multiplicative noise is important in natural systems (e.g. Majda 
ct al. 2009, Steinbrecher and Wcyssow 2004). This raises the question of what the 
effect of multiplicative noise would be on long-range dependence estimators. To 
investigate, we use data generated from a process which has a so called Correlated 
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Additive and Multiplicative (CAM) noise term. An example is the normal form 
for reduced climate models (Majda et al. 2009), which is given by the following 
Stochastic Differential Equation (SDE): 

dx = (F + ax + bx 2 - cx 3 )dt + a(L + Ix)dW. (4.2) 

As shown in Majda et al. (2009), the PDF of Eq. (|4.2[) exhibits a power-law decay 
over a particular range, although its tail ultimately decays exponentially. We gen- 
erate 1000 realisations by sampling random values for the SDE parameters from an 
uniform distribution while complying with the parameter relations as described in 
Majda et al. (2009). 

The GPH estimator is slightly biased towards positive values while DFA and 
R/S have larger biases. Observe that the Wavelet estimator (as well as both VB 
estimators (not shown)) estimate self-similar behaviour (Fig. 0J). While the PDF 
of Eq. (|4.2p decays in a power-law like way over a given range it is not self-similar 
because its ultimate decay is exponential (Majda et al. 2009). This makes the 
estimates obtained by the wavelet and VB estimators questionable. Furthermore, 
R/S, DFA, GPH and the Whittle estimator again show an uncomfortably large 
number of outliers, again suggesting that the estimators may not be very reliable 
for this case, or that the signal is not characterised simply by H and d. We note that 
Kantelhardt et al. (2002) have studied the performance of more general multifractal 
DFA methods which were found to extract the full range of scaling exponents in a 
particular multifractal test case. 

5. Natural Time Series Examples 

Now we return to the two natural time series from Section 2, and analyse their 
self-similarity and LRD characteristics. We have shown above that the two time 
series are non-Gaussian; are they also long-range dependent or self-similar? 

Applying the various estimation methods to the Faraday- Vernadsky tempera- 
ture time series gives evidence for long-range dependence but with a wide variety 
of values: Whittle d = 0.24, GPH d = 0.28, DFA2 d = 0.43, R/S d = 0.33, Wavelet 
H = 0.53, VB W H = 0.92 and VB 5 H = 0.96. While the three LRD estimators all 
provide evidence for long range dependence, they provide a rather large range of 
values of the LRD parameter, d. However, all of the estimates are between and 
0.5, suggesting a long-range dependent but stationary process. The H estimates are 
larger than the d estimates; this might suggest self-similarity with a « 1.5. There is 
also evidence for long-range dependence in the AE index, again with a wide variety 
of estimates from the various estimators: Whittle d = 0.28, GPH d = 0.72, DFA2 
d = 0.66, R/S d = 0.4, Wavelet H = 0.94, VB™ H = 0.97 and VB S H = 0.99. The 
estimates from GPH and DFA2 are larger than 0.5, suggesting a non- stationary 
time series, while the Whittle and R/S estimates suggests a stationary long-range 
dependent time series. Again the H estimates are larger than the d estimates, pro- 
viding some evidence for self-similarity. Other evidence has been presented that 
AE may not be a simple fractal (Consolini et al. 1996); recently work has focused 
on high-frequency non-stationary and lower-frequency 1/f properties (Rypdal and 
Rypdal 2010). 

While all LRD estimators agree that there is evidence for LRD in these two time 
series they provide a rather large range of possible values, illustrating the problem 
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Figure 5. Box plots of the difference between the nominal parameter value and the empir- 
ical parameter estimate a) GPH estimator for noise plus linear trend, b) GPH estimator 
for noise plus linear trend in second half of time series, c) DFA2 estimator for noise plus 
linear trend and d) DFA2 estimator for noise plus linear trend in second half of time series. 
On each box the central mark is the median, the edges of the boxes are the 25th and 75th 
percentiles, the whiskers extend to the most extreme data points not considered to be 
outliers and outliers are marked individually. 



of statistically robust LRD estimation in practice and the need for further inves- 
tigation of the performance of statistical indicators in the presence of departures 
from fractality, including possible multifractality. 

6. Conclusions 

There are two contributions to self-similarity: (i) long-range dependence and (ii) 
non-Gaussian jumps. This is not always appreciated in the various communities 
with interests in detecting self-similarity and long-range dependence. We have 
shown that empirical estimators of long-range dependence are at best biased for 
fractional Gaussian noise, but at worst not robust for processes which deviate from 
this idealised model. We have also shown that the empirical estimators are not very 
robust in the presence of trends and multiplicative noise. 

Our results have several important implications for the modelling of natural 
time series. In our view the ARFIMA model is a much better paradigmatic model 
of natural time series than fGn since it allows one to explicitly model short-range 
and long-range behaviour while also allowing for non-Gaussian increments. 

Finally, there is a need for estimation procedures which can deal with multi- 
plicative noise and trends in the variance. Such effects introduce sizeable biases 
and estimation uncertainty. As such, all estimations of LRD have to be taken with 
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precaution. While it is true that all of the estimators we tested perform reason- 
ably well for fractional Gaussian noise, once a time series is non-Gaussian or is 
non-stationary (in trend or volatility) the estimators can be problematic. 
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